Abstract

We present a linear-system solver that, given an $n$-by-$n$ symmetric positive semi-definite, diagonally
dominant matrix $A$ with $m$ non-zero entries and an $n$-vector $b$, produces a vector $\tilde{x}$ within relative
distance $\epsilon$ of the solution to $Ax = b$ in time $O(m^{1.31} \log(n\kappa_f(A)/\epsilon)^{O(1)})$, where $\kappa_f(A)$ is the ratio
of the largest to smallest non-zero eigenvalue of $A$. In particular, $\log(\kappa_f(A)) = O(b \log n)$, where $b$ is
the logarithm of the ratio of the largest to smallest non-zero entry of $A$. If the graph of $A$ has genus $m^{2\theta}$
or does not have a $K_{m^{\theta}}$ minor, then the exponent of $m$ can be improved to the minimum of $1 + 5\theta$ and
$(9/8)(1 + \theta)$. The key contribution of our work is an extension of Vaidya's techniques for constructing and
analyzing combinatorial preconditioners.



1 Introduction 

Sparse linear systems are ubiquitous in scientific computing and optimization. In this work, we develop fast 
algorithms for solving some of the best-behaved linear systems: those specified by symmetric, diagonally 
dominant matrices with positive diagonals. We call such matrices PSDDD as they are positive semi-definite 
and diagonally dominant. Such systems arise in the solution of certain elliptic differential equations via the 
finite element method, the modeling of resistive networks, and in the solution of certain network optimization 
problems [SF73, McC87, HY81, W62, You71].

While one is often taught to solve a linear system $Ax = b$ by computing $A^{-1}$ and then multiplying $A^{-1}$
by $b$, this approach is quite inefficient for sparse linear systems: the best known bound on the time required
to compute $A^{-1}$ is $O(n^{2.376})$ [CW82] and the representation of $A^{-1}$ typically requires $\Omega(n^2)$ space. In
contrast, if $A$ is symmetric and has $m$ non-zero entries, then one can use the Conjugate Gradient method, as a
direct method, to solve for $A^{-1}b$ in $O(nm)$ time and $O(n)$ space! Until Vaidya's revolutionary introduction
of combinatorial preconditioners [Vai90], this was the best complexity bound for the solution of general
PSDDD systems.
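
To make the use of Conjugate Gradient "as a direct method" concrete, the following is a minimal sketch that runs at most $n$ iterations on a small symmetric positive definite system. The names `matvec`, `dot`, and `conjugate_gradient`, the dense list-of-lists storage, and the early-exit tolerance are assumptions made for illustration only; a sparse implementation would carry out each matrix-vector product in $O(m)$ time.

```python
def matvec(A, x):
    """Dense matrix-vector product; a sparse version would cost O(m)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-12):
    """Solve Ax = b for symmetric positive definite A in at most n iterations."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                 # residual b - Ax for the starting point x = 0
    p = list(r)                 # first search direction
    rr = dot(r, r)
    for _ in range(n):          # in exact arithmetic n steps suffice
        if rr <= tol * tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Illustrative 2-by-2 system (hypothetical data); exact solution is (1/11, 7/11).
x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Only the vectors `x`, `r`, `p` and `Ap` are stored, which is the source of the $O(n)$ space bound mentioned above.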

* Partially supported by NSF grant CCR-0112487. spielman@math.mit.edu
† Partially supported by NSF grant CCR-9972532. steng@cs.bu.edu
The two most popular families of methods for solving linear systems are the direct methods and the iterative
methods. Direct methods, such as Gaussian elimination, perform arithmetic operations that produce $x$ treating
the entries of $A$ and $b$ symbolically. As discussed in Section 1.4, direct methods can be used to quickly
compute $x$ if the matrix $A$ has special topological structure.

Iterative methods, which are discussed in Section 1.5, compute successively better approximations to $x$. The
Chebyshev and Conjugate Gradient methods take time proportional to $m\sqrt{\kappa_f(A)}\log(\kappa_f(A)/\epsilon)$ to produce
approximations to $x$ with relative error $\epsilon$, where $\kappa_f(A)$ is the ratio of the largest to the smallest non-zero
eigenvalue of $A$. These algorithms are improved by preconditioning: essentially solving $B^{-1}Ax = B^{-1}b$
for a preconditioner $B$ that is carefully chosen so that $\kappa_f(A, B)$ is small and so that it is easy to solve
linear systems in $B$. These systems in $B$ may be solved using direct methods, or by again applying iterative
methods.

Vaidya [Vai90] discovered that for PSDDD matrices $A$ one could use combinatorial techniques to construct
matrices $B$ that provably satisfy both criteria. In his seminal work, Vaidya shows that when $B$ corresponds
to a subgraph of the graph of $A$, one can bound $\kappa_f(A, B)$ by bounding the dilation and congestion of the best
embedding of the graph of $A$ into the graph of $B$. By using preconditioners derived by adding a few edges
to maximum spanning trees, Vaidya's algorithm finds $\epsilon$-approximate solutions to PSDDD linear systems of
maximum valence $d$ in time $O((dn)^{1.75}\log(\kappa_f(A)/\epsilon))$.¹ When these systems have special structure, such as
having a sparsity graph of bounded genus or avoiding certain minors, he obtains even faster algorithms. For
example, his algorithm solves planar linear systems in time $O((dn)^{1.2}\log(\kappa_f(A)/\epsilon))$. This paper follows the
outline established by Vaidya: our contributions are improvements in the techniques for bounding $\kappa_f(A, B)$, a
construction of better preconditioners, a construction that depends upon average degree rather than maximum
degree, and an analysis of the recursive application of our algorithm.

As Vaidya's paper was never published², and his manuscript lacked many proofs, the task of formally working
out his results fell to others. Much of its content appears in the thesis of his student, Anil Joshi [Jos97].
Gremban, Miller and Zagha [Gre96, GMZ95] explain parts of Vaidya's paper as well as extend Vaidya's
techniques. Among other results, they found ways of constructing preconditioners by adding vertices to the
graphs and using separator trees.

Much of the theory behind the application of Vaidya's techniques to matrices with non-positive off-diagonals
is developed in [BGH+]. The machinery needed to apply Vaidya's techniques directly to matrices with positive
off-diagonal elements is developed in [BCHT]. The present work builds upon an algebraic extension
of the tools used to prove bounds on $\kappa_f(A, B)$ by Boman and Hendrickson [BH]. Boman and Hendrickson
[BH01] have pointed out that by applying one of their bounds on support to the tree constructed by Alon,
Karp, Peleg, and West [AKPW95] for the $k$-server problem, one obtains a spanning tree preconditioner $B$
with $\kappa_f(A, B) = m\,2^{O(\sqrt{\log n \log\log n})}$. They thereby obtain a solver for PSDDD systems that produces
$\epsilon$-approximate solutions in time $m^{1.5+o(1)}\log(\kappa_f(A)/\epsilon)$. In their manuscript, they asked whether one could
possibly augment this tree to obtain a better preconditioner. We answer this question in the affirmative. An
algorithm running in time $O(mn^{1/2}\log^2(n))$ has also recently been obtained by Maggs et al. [MMP+02].

The present paper is the first to push past the $O(m^{1.5})$ barrier. It is interesting to observe that this is exactly
the point at which one obtains sub-cubic time algorithms for solving dense PSDDD linear systems.

Reif [Rei98] proved that by applying Vaidya's techniques recursively, one can solve bounded-degree planar
positive definite diagonally dominant linear systems to relative accuracy $\epsilon$ in time $O(m^{1+o(1)}\log(\kappa(A)/\epsilon))$.
We extend this result to general planar PSDDD linear systems.

Due to space limitations in the FOCS proceedings, some proofs have been omitted. These are being gradually 
included in the on-line version of the paper. 

¹ For the reader unaccustomed to condition numbers, we note that for a PSDDD matrix $A$ in which each entry is specified using $b$
bits of precision, $\log(\kappa_f(A)) = O(b \log n)$.

² Vaidya founded the company Computational Applications and System Integration (http://www.casicorp.com) to market his linear
system solvers.
1.1 Background and Notation

A symmetric matrix $A$ is positive semi-definite if $x^T A x \ge 0$ for all vectors $x$. This is equivalent to having
all eigenvalues of $A$ non-negative.

In most of the paper, we will focus on Laplacian matrices: symmetric matrices with non-negative diagonals
and non-positive off-diagonals such that for all $i$, $\sum_j A_{i,j} = 0$. However, our results will apply to the more
general family of positive semidefinite, diagonally dominant (PSDDD) matrices, where a matrix is diagonally
dominant if $|A_{i,i}| \ge \sum_{j \ne i} |A_{i,j}|$ for all $i$. We remark that a symmetric matrix is PSDDD if and only if it is
diagonally dominant and all of its diagonals are non-negative.
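
The two definitions above translate directly into checks on a matrix. The sketch below is only illustrative: the function names, the dense list-of-lists representation, and the tolerance used for the row-sum test are assumptions, not part of the paper.

```python
def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(i))


def is_diagonally_dominant(A):
    """True if |A[i][i]| >= sum_{j != i} |A[i][j]| for every row i."""
    return all(
        abs(row[i]) >= sum(abs(a) for j, a in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )


def is_psddd(A):
    """Uses the characterization in the text: symmetric, diagonally
    dominant, and with non-negative diagonal entries."""
    return (is_symmetric(A) and is_diagonally_dominant(A)
            and all(A[i][i] >= 0 for i in range(len(A))))


def is_laplacian(A, tol=1e-12):
    """Symmetric, non-positive off-diagonals, and every row sums to zero."""
    n = len(A)
    off_diagonals_ok = all(
        A[i][j] <= 0 for i in range(n) for j in range(n) if i != j
    )
    rows_sum_to_zero = all(abs(sum(row)) <= tol for row in A)
    return is_symmetric(A) and off_diagonals_ok and rows_sum_to_zero


# Hypothetical examples: a star on three vertices, and a PSDDD matrix with a
# positive off-diagonal entry (so it is not a Laplacian).
star = [[2, -1, -1], [-1, 1, 0], [-1, 0, 1]]
mixed = [[2, 1], [1, 2]]
```

Every Laplacian passes `is_psddd`, but not conversely; the reductions of Section 1.3 are what bridge the two families.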

In this paper, we will restrict our attention to the solution of linear systems of the form $Ax = b$ where $A$ is a
PSDDD matrix. When $A$ is non-singular, that is when $A^{-1}$ exists, there exists a unique solution $x = A^{-1}b$
to the linear system. When $A$ is singular and symmetric, for every $b \in \mathrm{Span}(A)$ there exists a unique
$x \in \mathrm{Span}(A)$ such that $Ax = b$. If $A$ is the Laplacian of a connected graph, then the null space of $A$ is
spanned by $\mathbf{1}$.

There are two natural ways to formulate the problem of finding an approximate solution to a system $Ax = b$.
A vector $\tilde{x}$ has relative residual error $\epsilon$ if $\|A\tilde{x} - b\| \le \epsilon\|b\|$. We say that a solution $\tilde{x}$ is an $\epsilon$-approximate
solution if it is at relative distance at most $\epsilon$ from the actual solution, that is, if $\|x - \tilde{x}\| \le \epsilon\|x\|$. One
can relate these two notions of approximation by observing that relative distance of $\tilde{x}$ to the solution and the
relative residual error differ by a multiplicative factor of at most $\kappa_f(A)$. We will focus our attention on the
problem of finding $\epsilon$-approximate solutions.

The ratio $\kappa_f(A)$ is the finite condition number of $A$. The $\ell_2$ norm of a matrix, $\|A\|$, is the maximum of
$\|Ax\| / \|x\|$, and equals the largest eigenvalue of $A$ if $A$ is symmetric. For non-symmetric matrices, $\lambda_{\max}(A)$
and $\|A\|$ are typically different. We let $|A|$ denote the number of non-zero entries in $A$, and $\min(A)$ and
$\max(A)$ denote the smallest and largest non-zero elements of $A$ in absolute value, respectively.

The condition number plays a prominent role in the analysis of iterative linear system solvers. When $A$ is
PSD, it is known that, after $\sqrt{\kappa_f(A)}\log(1/\epsilon)$ iterations, the Chebyshev iterative method and the Conjugate
Gradient method produce solutions with relative residual error at most $\epsilon$. To obtain an $\epsilon$-approximate
solution, one need merely run $\log(\kappa_f(A))$ times as many iterations. If $A$ has $m$ non-zero entries, each of these
iterations takes time $O(m)$. When applying the preconditioned versions of these algorithms to solve systems
of the form $B^{-1}Ax = B^{-1}b$, the number of iterations required by these algorithms to produce an $\epsilon$-accurate
solution is bounded by $\sqrt{\kappa_f(A, B)}\log(\kappa_f(A)/\epsilon)$ where

$$\kappa_f(A, B) = \left(\max_{x : Ax \ne 0} \frac{x^T A x}{x^T B x}\right)\left(\max_{x : Ax \ne 0} \frac{x^T B x}{x^T A x}\right)$$

for symmetric $A$ and $B$ with $\mathrm{Span}(A) = \mathrm{Span}(B)$. However, each iteration of these methods takes time
$O(m)$ plus the time required to solve linear systems in $B$. In our initial algorithm, we will use direct methods
to solve these systems, and so will not have to worry about approximate solutions. For the recursive application
of our algorithms, we will use our algorithm again to solve these systems, and so will have to determine
how well we need to approximate the solution. For this reason, we will analyze the Chebyshev iteration
instead of the Conjugate Gradient, as it is easier to analyze the impact of approximation in the Chebyshev
iterations. However, we expect that similar results could be obtained for the preconditioned Conjugate Gradient.
For more information on these methods, we refer the reader to [GV89] or [Bru95].

1.2 Laplacians and Weighted Graphs 

All weighted graphs in this paper have positive weights. There is a natural isomorphism between weighted
graphs and Laplacian matrices: given a weighted graph $G = (V, E, w)$, we can form the Laplacian matrix in
which $A_{i,j} = -w(i, j)$ for $(i, j) \in E$, and with diagonals determined by the condition $A\mathbf{1} = 0$. Conversely,
a weighted graph is naturally associated to each Laplacian matrix. Each vertex of the graph corresponds to
both a row and column of the matrix, and we will often abuse notation by identifying this row/column pair
with the associated vertex.

We note that if $G_1$ and $G_2$ are weighted graphs on the same vertex set with disjoint sets of edges, then the
Laplacian of the union of $G_1$ and $G_2$ is the sum of their Laplacians.
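
The correspondence between weighted graphs and Laplacians, and the additivity over edge-disjoint unions, can be checked on a toy example. In this sketch the function name `laplacian` and the edge format `(i, j, w)` are assumptions chosen for illustration.

```python
def laplacian(n, weighted_edges):
    """Laplacian of a weighted graph on vertices 0..n-1.

    Each edge (i, j, w) contributes -w to entries (i, j) and (j, i),
    and +w to the two diagonal entries, so that every row sums to zero.
    """
    A = [[0.0] * n for _ in range(n)]
    for i, j, w in weighted_edges:
        A[i][j] -= w
        A[j][i] -= w
        A[i][i] += w
        A[j][j] += w
    return A


# Two hypothetical edge-disjoint graphs on the same three vertices.
G1 = [(0, 1, 2.0)]
G2 = [(1, 2, 3.0)]
L1, L2 = laplacian(3, G1), laplacian(3, G2)
L_union = laplacian(3, G1 + G2)
L_sum = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(L1, L2)]
```

Here `L_union` and `L_sum` coincide entry by entry, as the remark above states.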

1.3 Reductions 

In most of this paper we just consider Laplacian matrices of connected graphs. This simplification is enabled 
by two reductions. 

First, we note that it suffices to construct preconditioners for matrices satisfying $A_{i,i} = \sum_{j \ne i} |A_{i,j}|$ for all $i$.
This follows from the observation in [BGH+] that if $\tilde{A} = A + D$, where $A$ satisfies the above condition, then
$\kappa_f(\tilde{A}, B + D) \le \kappa_f(A, B)$. So, it suffices to find a preconditioner after subtracting off the maximal diagonal
matrix that maintains positive diagonal dominance.

We then use an idea of Gremban [Gre96] for handling positive off-diagonal entries. If $A$ is a symmetric
matrix such that for all $i$, $A_{i,i} \ge \sum_{j \ne i} |A_{i,j}|$, then Gremban decomposes $A$ into $D + A_n + A_p$, where $D$ is
the diagonal of $A$, $A_n$ is the matrix containing all negative off-diagonal entries of $A$, and $A_p$ contains all the
positive off-diagonals. Gremban then considers the linear system

$$\begin{bmatrix} D + A_n & -A_p \\ -A_p & D + A_n \end{bmatrix} \begin{bmatrix} x \\ x' \end{bmatrix} = \begin{bmatrix} b \\ -b \end{bmatrix}$$

and observes that its solution will have $x' = -x$ and that $x$ will be the solution to $Ax = b$. Thus, by making
this transformation, we can convert any PSDDD linear system into one with non-positive off-diagonals.
One can understand this transformation as making two copies of every vertex in the graph, and two copies of 
every edge. The edges corresponding to negative off-diagonals connect nodes in the same copy of the graph, 
while the others cross copies. To capture the resulting family of graphs, we define a weighted graph G to be 
a Gremban cover if it has 2n vertices and 

• for $i, j \le n$, $(i, j) \in E$ if and only if $(i + n, j + n) \in E$, and $w(i, j) = w(i + n, j + n)$,

• for $i, j \le n$, $(i, j + n) \in E$ if and only if $(i + n, j) \in E$, and $w(i, j + n) = w(i + n, j)$, and

• the graph contains no edge of the form $(i, i + n)$.

When necessary, we will explain how to modify our arguments to handle Laplacians that are Gremban covers. 
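
As a sanity check on Gremban's reduction, the sketch below assembles the doubled system with diagonal blocks $D + A_n$ and off-diagonal blocks $-A_p$ (the sign convention under which $x' = -x$ recovers $Ax = b$) and verifies that $(x, -x)$ solves it. The function names and the small test matrix are hypothetical and serve only as an illustration.

```python
def gremban_system(A, b):
    """Return (M, rhs) with M = [[D + An, -Ap], [-Ap, D + An]], rhs = [b; -b]."""
    n = len(A)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            a = A[i][j]
            if i == j or a < 0:
                # Diagonal and negative entries stay inside each copy.
                M[i][j] = M[i + n][j + n] = a
            elif a > 0:
                # Positive off-diagonals cross between the copies, negated.
                M[i][j + n] = M[i + n][j] = -a
    return M, list(b) + [-bi for bi in b]


def matvec(M, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in M]


# Hypothetical PSDDD matrix with one positive off-diagonal pair.
A = [[3, 1, -1], [1, 2, 0], [-1, 0, 2]]
x = [1, 2, 3]
b = matvec(A, x)                      # b = Ax = [2, 5, 5]
M, rhs = gremban_system(A, b)
doubled = x + [-xi for xi in x]       # the vector (x, x') with x' = -x
```

The resulting matrix `M` has only non-positive off-diagonal entries, and the edges coming from $A_p$ are exactly those that cross between the two copies of the vertex set.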

Finally, if A is the Laplacian of an unconnected graph, then the blocks corresponding to the connected 
components may be solved independently. 

1.4 Direct Methods 

The standard direct method for solving symmetric linear systems is Cholesky factorization. Those unfamiliar
with Cholesky factorization should think of it as Gaussian elimination in which one simultaneously eliminates
on rows and columns so as to preserve symmetry. Given a permutation matrix $P$, Cholesky factorization
produces a lower-triangular matrix $L$ such that $LL^T = PAP^T$. Because one can use forward and back
substitution to multiply vectors by $L^{-1}$ and $L^{-T}$ in time proportional to the number of non-zero entries in $L$,
one can use the Cholesky factorization of $A$ to solve the system $Ax = b$ in time $O(|L|)$.
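
The following sketch illustrates the factor-then-substitute pattern on a small example. It assumes a positive definite input, takes $P$ to be the identity, and stores matrices densely; the names `cholesky` and `solve_with_cholesky` are chosen here for illustration. A sparse implementation would touch only the non-zero entries of $L$ during the two substitutions.

```python
import math


def cholesky(A):
    """Return lower-triangular L with L L^T = A (A assumed positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_with_cholesky(L, b):
    """Solve L L^T x = b by forward and then back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                     # forward: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):           # back: L^T x = y
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x


# Hypothetical 2-by-2 example whose factor has integer entries.
L = cholesky([[4.0, 2.0], [2.0, 5.0]])
x = solve_with_cholesky(L, [6.0, 7.0])
```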
Each pivot in the factorization comes from the diagonal of $A$, and one should understand the permutation $P$
as providing the order in which these pivots are chosen. Many heuristics exist for producing permutations $P$
for which the number of non-zeros in $L$ is small. If the graph of $A$ is a tree, then a permutation $P$ that orders
the vertices of $A$ from the leaves up will result in an $L$ with at most $2n - 1$ non-zero entries. In this work,
we will use results concerning matrices whose sparsity graphs resemble trees with a few additional edges and
whose graphs have small separators, which we now review.

If $B$ is the Laplacian matrix of a weighted graph $(V, E, w)$, and one eliminates a vertex $a$ of degree 1, then
the remaining matrix has the form

$$\begin{bmatrix} 1 & 0 \\ 0 & A_1 \end{bmatrix},$$

where $A_1$ is the Laplacian of the graph in which $a$ and its attached edge have been removed. Similarly, if a
vertex $a$ of degree 2 is eliminated, then the remaining matrix is the Laplacian of the graph in which the vertex
$a$ and its adjacent edges have been removed, and an edge with weight $1/(1/w_1 + 1/w_2)$ is added between
the two neighbors of $a$, where $w_1$ and $w_2$ are the weights of the edges connecting $a$ to its neighbors.
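
Both elimination rules can be observed by forming the Schur complement on the eliminated vertex. The sketch below uses a hypothetical three-vertex path with weights $w_1 = 1$ and $w_2 = 3$; the function name `eliminate_vertex` is an assumption made for this illustration.

```python
def eliminate_vertex(B, a):
    """Schur complement of B on the vertices other than a (needs B[a][a] != 0)."""
    keep = [i for i in range(len(B)) if i != a]
    return [
        [B[i][j] - B[i][a] * B[a][j] / B[a][a] for j in keep]
        for i in keep
    ]


# Laplacian of the path 0 -(w1 = 1)- 1 -(w2 = 3)- 2.
B = [[1.0, -1.0, 0.0],
     [-1.0, 4.0, -3.0],
     [0.0, -3.0, 3.0]]

# Degree 2: eliminating the middle vertex leaves a single edge between
# vertices 0 and 2 of weight 1 / (1/w1 + 1/w2) = 0.75.
after_degree_two = eliminate_vertex(B, 1)

# Degree 1: eliminating the leaf 0 leaves the Laplacian of the edge (1, 2).
after_degree_one = eliminate_vertex(B, 0)
```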

Given a graph $G$ with edge set $E = R \cup S$, where the edges in $R$ form a tree, we will perform a partial
Cholesky factorization of $G$ in which we successively eliminate all the degree 1 and 2 vertices that are not
endpoints of edges in $S$. We introduce the algorithm trim to define the order in which the vertices should be
eliminated, and we call the trim order the order in which trim deletes vertices.

Algorithm: trim(V, R, S)

1. While $G$ contains a vertex of degree one that is not an endpoint of an edge in $S$, remove that vertex and
its adjacent edge.

2. While $G$ contains a vertex of degree two that is not an endpoint of an edge in $S$, remove that vertex and
its adjacent edges, and add an edge between its two neighbors.

Proposition 1.1. The output of trim is a graph with at most $4|S|$ vertices and $5|S|$ edges.

Remark 1.2. If $(V, R)$ and $(V, S)$ are Gremban covers, then we can implement trim so that the output
graph is also a Gremban cover. Moreover, the genus and maximum size clique minor of the output graph do
not increase.
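
A direct, unoptimized rendering of trim may help fix ideas. The sketch below works on an unweighted multigraph and records only the trim order; the data layout, the quadratic-time scanning loop, and the choice to drop a self-loop when both edges of a degree-two vertex lead to the same neighbor are assumptions made here, not details taken from the paper.

```python
import itertools


def trim(vertices, R, S):
    """Remove degree-1, then degree-2, vertices that are not endpoints of S.

    R and S are lists of (u, v) pairs. Returns (remaining_vertices,
    remaining_edges, order), where order is the trim order.
    """
    protected = {x for e in S for x in e}
    edges = dict(enumerate([tuple(e) for e in R] + [tuple(e) for e in S]))
    fresh = itertools.count(len(edges))        # ids for edges added in step 2
    incident = {v: set() for v in vertices}
    for i, (a, b) in edges.items():
        incident[a].add(i)
        incident[b].add(i)
    order = []

    def remove_vertex(v):
        """Delete v and its incident edges; return its former neighbors."""
        nbrs = []
        for i in sorted(incident[v]):
            a, b = edges.pop(i)
            u = b if a == v else a
            nbrs.append(u)
            incident[u].discard(i)
        del incident[v]
        order.append(v)
        return nbrs

    for degree in (1, 2):                      # step 1, then step 2
        progress = True
        while progress:
            progress = False
            for v in list(incident):
                if v in protected or len(incident[v]) != degree:
                    continue
                nbrs = remove_vertex(v)
                progress = True
                if degree == 2 and nbrs[0] != nbrs[1]:
                    i = next(fresh)
                    edges[i] = (nbrs[0], nbrs[1])
                    incident[nbrs[0]].add(i)
                    incident[nbrs[1]].add(i)
    return set(incident), list(edges.values()), order


# Hypothetical input: the tree R is a path 0-1-2-3-4-5 with a leaf 6
# hanging off vertex 2, and S closes the path into a cycle.
R = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]
S = [(0, 5)]
remaining, remaining_edges, order = trim(range(7), R, S)
```

On this input the leaf is removed first, the interior path vertices follow, and only the two endpoints of the edge in $S$ survive, joined by two parallel edges, which is within the bounds of Proposition 1.1.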

After performing partial Cholesky factorization of the vertices in the trim order, one obtains a factorization
of the form

$$B = LCL^T, \quad \text{where} \quad C = \begin{bmatrix} I & 0 \\ 0 & A_1 \end{bmatrix},$$

$L$ is lower triangular, and the left and right columns in the above representation correspond to the
eliminated and remaining vertices respectively. Moreover, $|L| \le 2n - 1$, and this Cholesky factorization may
be performed in time $O(n + |S|)$.

The following Lemma may be proved by induction.

Lemma 1.3. Let $B$ be a Laplacian matrix and let $L$ and $A_1$ be the matrices arising from the partial Cholesky
factorization of $B$ according to the trim order. Let $U$ be the set of eliminated vertices, and let $W$ be the set of
remaining vertices. For each pair of vertices $(a, b)$ in $W$ joined by a simple path containing only vertices of
$U$, let $B_{(a,b)}$ be the Laplacian of the graph containing just one edge between $a$ and $b$ of weight $1/(\sum_i 1/w_i)$,
where the $w_i$ are the weights on the path between $a$ and $b$. Then,

(a) the matrix $A_1$ is the sum of the Laplacian of the induced graph on $W$ and the sum of all the Laplacians
$B_{(a,b)}$,

(b) $\|A_1\| \le \|B\|$, $\lambda_2(A_1) \ge \lambda_2(B)$, and so $\kappa_f(A_1) \le \kappa_f(B)$.



Other topological structures may be exploited to produce elimination orderings that result in sparse $L$. In
particular, Lipton, Rose and Tarjan [LRT79] prove that if the sparsity graph is planar, then one can find such
an $L$ with at most $O(n \log n)$ non-zero entries in time $O(n^{3/2})$. In general, Lipton, Rose and Tarjan prove
that if a graph can be dissected by a family of small separators, then $L$ can be made sparse. The precise
definition and theorem follow.

Definition 1.4. A subset of vertices $C$ of a graph $G = (V, E)$ with $n$ vertices is an $f(n)$-separator if $|C| \le
f(n)$, and the vertices of $V - C$ can be partitioned into two sets $U$ and $W$ such that there are no edges from
$U$ to $W$, and $|U|, |W| \le 2n/3$.

Definition 1.5. Let $f()$ be a positive function. A graph $G = (V, E)$ with $n$ vertices has a family of
$f()$-separators if for every $s \le n$, every subgraph $G' \subseteq G$ with $s$ vertices has an $f(s)$-separator.

Theorem 1.6 (Nested Dissection: Lipton-Rose-Tarjan). Let $A$ be an $n$ by $n$ symmetric PSD matrix, $\alpha > 0$
be a constant, and $h(n)$ be a positive function of $n$. Let $f(x) = h(n)x^{\alpha}$. If $G(A)$ has a family of
$f()$-separators, then the Nested Dissection Algorithm of Lipton, Rose and Tarjan can, in $O(n + (h(n)n^{\alpha})^3)$ time,
factor $A$ into $A = LL^T$ so that $L$ has at most $O((h(n)n^{\alpha})^2 \log n)$ non-zeros.

To apply this theorem, we note that many families of graphs are known to have families of small separators.
Gilbert, Hutchinson, and Tarjan [GHT84] show that all graphs of $n$ vertices with genus bounded by $g$ have a
family of $O(\sqrt{gn})$-separators, and Plotkin, Rao and Smith [PRS94] show that any graph that excludes $K_s$ as a
minor has a family of $O(s\sqrt{n \log n})$-separators.

1.5 Iterative Methods 

Iterative methods such as Chebyshev iteration and Conjugate Gradient solve systems such as $Ax = b$ by
successively multiplying vectors by the matrix $A$, and then taking linear combinations of vectors that have
been produced so far. The preconditioned versions of these iterative methods take as input another matrix
$B$, called the preconditioner, and also perform the operation of solving linear systems in $B$. In this paper,
we will restrict our attention to the preconditioned Chebyshev method as it is easier to understand the effect
of imprecision in the solution of the systems in $B$ on the method's output. In the non-recursive version of
our algorithms, we will exploit the standard analysis of Chebyshev iteration (see [Bru95]), adapted to our
situation:

Theorem 1.7 (Preconditioned Chebyshev). Let $A$ and $B$ be Laplacian matrices, let $b$ be a vector, and
let $x$ satisfy $Ax = b$. At each iteration, the preconditioned Chebyshev method multiplies one vector by $A$,
solves one linear system in $B$, and performs a constant number of vector additions. At the $k$th iteration, the
algorithm maintains a solution $\tilde{x}$ satisfying



In the non-recursive versions of our algorithms, we will pre-compute the Cholesky factorization of the
preconditioners $B$, and use these to solve the linear systems encountered by preconditioned Chebyshev method.
In the recursive versions, we will perform a partial Cholesky factorization of $B$, into a matrix of the form
$L[I, 0; 0, A_1]L^T$, construct a preconditioner for $A_1$, and again use the preconditioned Chebyshev method to
solve the systems in $A_1$.
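
The per-iteration structure described in Theorem 1.7 (one multiplication by $A$, one solve in $B$, and a constant number of vector operations) can be seen in the following sketch of the textbook preconditioned Chebyshev recurrence. Several simplifying assumptions are made: the example matrix is non-singular rather than a singular Laplacian, the preconditioner is a plain diagonal scaling standing in for a combinatorial preconditioner, and the bounds `lam_min < lam_max` on the eigenvalues of $B^{-1}A$ are supplied by the caller. All names are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def preconditioned_chebyshev(A, solve_B, b, lam_min, lam_max, iterations):
    """Chebyshev iteration for B^{-1} A x = B^{-1} b.

    solve_B(r) must return B^{-1} r; the eigenvalues of B^{-1} A are
    assumed to lie in [lam_min, lam_max] with 0 < lam_min < lam_max.
    """
    theta = (lam_max + lam_min) / 2.0     # centre of the eigenvalue interval
    delta = (lam_max - lam_min) / 2.0     # half-width of the interval
    sigma1 = theta / delta
    x = [0.0] * len(b)
    r = list(b)                           # residual b - A x
    z = solve_B(r)                        # preconditioned residual
    rho = 1.0 / sigma1
    d = [zi / theta for zi in z]
    for _ in range(iterations):
        x = [xi + di for xi, di in zip(x, d)]
        Ad = matvec(A, d)                 # the one multiplication by A
        r = [ri - adi for ri, adi in zip(r, Ad)]
        z = solve_B(r)                    # the one solve in B
        rho_next = 1.0 / (2.0 * sigma1 - rho)
        d = [rho_next * rho * di + (2.0 * rho_next / delta) * zi
             for di, zi in zip(d, z)]
        rho = rho_next
    return x


# Hypothetical example: a tridiagonal diagonally dominant matrix with the
# diagonal preconditioner B = 2I, for which the eigenvalues of B^{-1} A are
# 1 - cos(k * pi / 4), k = 1, 2, 3, all inside [0.29, 1.71].
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
x = preconditioned_chebyshev(A, lambda r: [ri / 2.0 for ri in r],
                             [1.0, 0.0, 1.0], 0.29, 1.71, 60)
```

Because `solve_B` is passed in as a function, the same loop applies whether the systems in $B$ are solved exactly by a precomputed factorization or approximately by a recursive call; the latter is the situation that requires the more careful analysis.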
2 Support Theory

The essence of support theory is the realization that one can bound $\kappa_f(A, B)$ by constructing an embedding
of $A$ into $B$. We define a weighted embedding of $A$ into $B$ to be a function $\pi$ that maps each edge
$e$ of $A$ into a weighted simple path in $B$ linking the endpoints of $e$. Formally, $\pi : E_A \times E_B \to \mathbb{R}^+$
is a weighted embedding if for all $e \in A$, $\{f \in B : \pi(e, f) > 0\}$ is a simple path connecting one
endpoint of $e$ to the other. We let $\mathrm{path}_\pi(e)$ denote the set of edges in this path in $B$. Writing $a_e$ for the
weight of an edge $e \in A$ and $b_f$ for the weight of an edge $f \in B$, we define

$$\mathrm{wd}_\pi(e) = \sum_{f \in \mathrm{path}_\pi(e)} \frac{a_e}{b_f\,\pi(e, f)}$$

for $e \in A$, and the weighted congestion of an edge $f \in B$ under $\pi$ to be

$$\mathrm{wc}_\pi(f) = \sum_{e : f \in \mathrm{path}_\pi(e)} \mathrm{wd}_\pi(e)\,\pi(e, f).$$

Our analysis of our preconditioners relies on the following extension of the support graph theory. 

Theorem 2.1 (Support Theorem). Let $A$ be the Laplacian matrix of a weighted graph $G$ and $B$ be the
Laplacian matrix of a subgraph $F$ of $G$. Let $\pi$ be a weighted embedding of $G$ into $F$. Then

$$\kappa_f(A, B) \le \max_{f \in F} \mathrm{wc}_\pi(f).$$

To understand this statement, the reader should first consider the case in which all the weights $a_e$, $b_f$ and
$\pi(e, f)$ are 1. In this case, the Support Theorem says that $\kappa_f(A, B)$ is at most the maximum over edges $f$ of
the sum of the lengths of the paths through $f$. This improves upon the upper bound on $\kappa_f(A, B)$ stated by
Vaidya and proved in Bern et al. of the maximum congestion times the maximum dilation, and it improves
upon the bound proved by Boman and Hendrickson which was the sum of the dilations. This statement also
extends the previous theories by using fractions of edges in $B$ to route edges in $A$. That said, our proof of the
Support Theorem owes a lot to the machinery developed by Boman and Hendrickson and our $\pi$ is analogous
to their matrix $M$.
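
To illustrate the quantities entering the Support Theorem, the sketch below evaluates the weighted dilation $\mathrm{wd}_\pi(e) = \sum_{f} a_e / (b_f\,\pi(e, f))$ and the weighted congestion $\mathrm{wc}_\pi(f) = \sum_{e} \mathrm{wd}_\pi(e)\,\pi(e, f)$ for a triangle embedded into a two-edge spanning path. The dictionary-based representation and the function names are assumptions for illustration, and the code does not verify that each `pi[e]` actually forms a simple path.

```python
def weighted_dilation(a, b, pi, e):
    """wd(e): sum over the edges f on e's path of a_e / (b_f * pi(e, f))."""
    return sum(a[e] / (b[f] * p) for f, p in pi[e].items())


def weighted_congestion(a, b, pi):
    """wc(f) for every edge f of B: sum over e routed through f of wd(e) * pi(e, f)."""
    wc = {f: 0.0 for f in b}
    for e in a:
        wd = weighted_dilation(a, b, pi, e)
        for f, p in pi[e].items():
            wc[f] += wd * p
    return wc


# Hypothetical example: A is a unit-weight triangle, B its spanning path
# 0 - 1 - 2. Tree edges route along themselves; the edge (0, 2) is routed
# along both path edges with pi = 1.
a = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
b = {(0, 1): 1.0, (1, 2): 1.0}
pi = {
    (0, 1): {(0, 1): 1.0},
    (1, 2): {(1, 2): 1.0},
    (0, 2): {(0, 1): 1.0, (1, 2): 1.0},
}
wc = weighted_congestion(a, b, pi)
bound = max(wc.values())
```

Here the bound evaluates to 3, and it is attained: for $x = (1, 0, -1)$ one has $x^T A x = 6$ and $x^T B x = 2$.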

We first recall the definition of the support of $A$ in $B$, denoted $\sigma(A, B)$:

$$\sigma(A, B) = \min\{\tau : \forall t \ge \tau,\ tB \succeq A\}.$$

Gremban proved that one can use support to characterize $\kappa_f$:

Lemma 2.2. If $\mathrm{Null}(A) = \mathrm{Null}(B)$, then

$$\kappa_f(A, B) = \sigma(A, B)\,\sigma(B, A).$$

Vaidya observed

Lemma 2.3. If $F$ is a subgraph of the weighted graph $G$, $A$ is the Laplacian of $G$ and $B$ is the Laplacian of
$F$, then $\sigma(B, A) \le 1$.

Our proof of the Support Theorem will use the Splitting Lemma of Bern et al. and the Rank-One Support
Lemma of Boman-Hendrickson:

Lemma 2.4 (Splitting Lemma). Let $A = A_1 + A_2 + \cdots + A_k$ and let $B = B_1 + B_2 + \cdots + B_k$. Then,

$$\sigma(A, B) \le \max_i \sigma(A_i, B_i).$$

For an edge $e \in A$ and a weighted embedding $\pi$ of $A$ into $B$, we let $A_e$ denote the Laplacian of the graph
containing only the weighted edge $e$ and $B_e$ denote the Laplacian of the graph containing the edges $f \in
\mathrm{path}_\pi(e)$ with weights $b_f\,\pi(e, f)$. We have:

Lemma 2.5 (Weighted Dilation). For an edge $e \in A$,

$$\sigma(A_e, B_e) = \mathrm{wd}_\pi(e).$$
Proof. Follows from Boman and Hendrickson's Rank-One Support Lemma. □

Proof of Theorem 2.1. Lemma 2.5 implies

$$\sigma(A_e, \mathrm{wd}_\pi(e) B_e) = 1.$$

We then have

$$\sigma\Big(A, \max_f \mathrm{wc}_\pi(f)\, B\Big) \le \sigma\Big(A, \sum_e \mathrm{wd}_\pi(e) B_e\Big) \le \max_e \sigma(A_e, \mathrm{wd}_\pi(e) B_e) \le 1,$$

where the second-to-last inequality follows from the Splitting Lemma. □



3 The Preconditioner 

In this section, we construct and analyze our preconditioner.

Theorem 3.1. Let $A$ be a Laplacian and $G = (V, E, w)$ its corresponding weighted graph. Let $G$ have $n$
vertices and $m$ edges. For any positive integer $t \le n$, the algorithm precondition, described below, runs
in $O(m \log m)$ time and outputs a spanning tree $R \subseteq E$ of $G$ and a set of edges $S \subseteq E$ such that

(1) if $B$ is the Laplacian corresponding to $R \cup S$, then $\sigma_f(A, B) \le \frac{m}{t}\, 2^{O(\sqrt{\log n \log\log n})}$, and

(2) $|S| \le O(t^2 \log n / \log\log n)$.

Moreover, if $G$ has genus $s^2$ or has no $K_s$ minor, then

(2') $|S| \le O(ts \log s \log n / \log\log n)$,

and if $G$ is the Gremban cover of such a graph, then the same bound holds and we can ensure that $S$ is a
Gremban cover as well.

Proof. Everything except the statement concerning Gremban covers follows immediately from Theorem 2.1
and LemmasEllEll and lTT3l 

In the case that $G$ is a Gremban cover, we apply the algorithm precondition to the graph that it covers, but
keeping all weights positive. We then set $R$ and $S$ to be both images of each edge output by the algorithm.
Thus, the size of the set $S$ is at most twice what it would otherwise be.

For our purposes, the critical difference between these two graphs is that a cycle in the covered graph
corresponds in the Gremban cover to either two disjoint cycles or a double-traversal of that cycle. Altering the
arguments to compensate for this change increases the bound of Lemma 3.10 by at most a factor of 3, and the
bound of Lemma 3.13 by at most 9. □

The spanning tree $R$ is built using an algorithm of Alon, Karp, Peleg, and West [AKPW95]. The edges in
the set $S$ are constructed by using other information generated by this algorithm. In particular, the AKPW
algorithm builds its spanning tree by first building a spanning forest, then building a spanning forest over
that forest, and so on. Our algorithm works by decomposing the trees in these forests, and then adding a
representative edge between each set of vertices in the decomposed trees.

Throughout this section, we assume without loss of generality that the maximum weight of an edge is 1.

3.1 The Alon-Karp-Peleg-West Tree 

We build our preconditioners by adding edges to the spanning trees constructed by Alon, Karp, Peleg and
West [AKPW95]. In this subsection, we review their algorithm, state the properties we require of the trees it
produces, and introduce the notation we need to define and analyze our preconditioner.

The AKPW algorithm is run with the parameters $x = 2^{\sqrt{\log n \log\log n}}$ and $\rho = \lceil 3 \log n / \log x \rceil$, and the parameters
$\mu = 9\rho \log n$ and $y = x\mu$ are used in its analysis.

We assume, without loss of generality, that the maximum weight edge in $E$ has weight 1. The AKPW
algorithm begins by partitioning the edge set $E$ by weight as follows:

$$E_i = \{e \in E : 1/y^i < w(e) \le 1/y^{i-1}\}.$$

For each edge $e \in E$, let $\mathrm{class}(e)$ be the index such that $e \in E_{\mathrm{class}(e)}$.
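
The weight classes can be computed by a simple scan. The sketch below assumes the partition $E_i = \{e : 1/y^i < w(e) \le 1/y^{i-1}\}$ with maximum weight 1, and takes $y$ as a given input rather than deriving it from $n$; the function names and the edge format `(u, v, w)` are hypothetical.

```python
def weight_class(w, y):
    """Return the index i >= 1 with 1/y**i < w <= 1/y**(i-1)."""
    if not (0 < w <= 1) or y <= 1:
        raise ValueError("expected 0 < w <= 1 and y > 1")
    i = 1
    while w <= 1.0 / y ** i:
        i += 1
    return i


def partition_by_class(edges, y):
    """Group edges (u, v, w) into a dict mapping class index -> list of edges."""
    classes = {}
    for u, v, w in edges:
        classes.setdefault(weight_class(w, y), []).append((u, v, w))
    return classes


# Hypothetical weights with y = 2: class 1 is (1/2, 1], class 2 is (1/4, 1/2], ...
classes = partition_by_class(
    [(0, 1, 1.0), (1, 2, 0.5), (2, 3, 0.3), (3, 0, 0.25)], 2.0
)
```

Heavier edges land in lower-numbered classes, so the algorithm described next considers the heaviest edges first.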

The AKPW algorithm iteratively applies a modification of an algorithm of Awerbuch [Awe85], which we call
cluster, whose relevant properties are summarized in the following lemma.

Lemma 3.2 (Colored Awerbuch). There exists an algorithm with template

$F = \mathrm{cluster}(G, x, E_1, \ldots, E_k)$,

where $G = (V, E)$ is a graph, $x$ is a number, $E_1, \ldots, E_k$ are disjoint subsets of $E$, and $F$ is a spanning forest
of $V$, such that

(1) each tree of $F$ has depth at most $3xk \log n$,

(2) for each $1 \le i \le k$, the number of edges in class $E_i$ between vertices in the same tree of $F$ is at least
$x$ times the number of edges in class $E_i$ between vertices in distinct trees of $F$, and

(3) cluster runs in time $O(\sum_i |E_i|)$.

Proof. Properties (1) and (2) are established in the proof of Lemma 5.5 in [AKPW95]. To justify the running
time bound, we review the algorithm. We first recall that it only pays attention to edges in $\cup_i E_i$. The
algorithm proceeds by growing a BFS tree level-by-level from a vertex that is not included in the current
forest. It grows this tree until a level is reached at which condition (2) is satisfied. Once condition (2) is
satisfied, it adds this tree to the forest, and begins to grow again from a vertex not currently in the forest. □

The other part of the AKPW algorithm is a subroutine with template

$G' = \mathrm{contract}(G, F)$,

that takes as input a graph $G$ and a spanning forest $F$ of $G$, and outputs the multigraph $G'$ obtained by
contracting the vertices of each tree in $F$ to a single vertex. This contraction removes all resulting self-loops
(which result from edges between vertices in the same tree), but keeps an image of each edge between distinct
trees of $F$. The classes, weights, and names of the edges are preserved, so that each edge in $G'$ can be mapped
back to a unique pre-image in $G$.
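
The contraction step is straightforward to express with a union-find structure. In the sketch below, each contracted edge carries the index of its pre-image in the input edge list, which plays the role of the preserved edge "name"; the function name, argument order, and representation are assumptions made for illustration.

```python
def contract(vertices, edges, forest_edges):
    """Contract each tree of the forest to one vertex.

    Returns a list of (root_u, root_v, index) triples, one per edge of
    `edges` joining distinct trees, where `index` identifies the pre-image.
    Edges inside a single tree would become self-loops and are dropped.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]    # path halving
            v = parent[v]
        return v

    for u, v in forest_edges:
        parent[find(u)] = find(v)

    contracted = []
    for index, (u, v) in enumerate(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            contracted.append((ru, rv, index))
    return contracted


# Hypothetical 4-cycle with forest {(0, 1), (2, 3)}: the two forest edges
# vanish and the two remaining edges become parallel edges of a multigraph.
cycle = [(0, 1), (1, 2), (2, 3), (0, 3)]
contracted = contract(range(4), cycle, [(0, 1), (2, 3)])
```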

We can now state the AKPW algorithm:

Algorithm: R = AKPW(G)

1. Set $j = 1$ and $G^{(1)} = G$.

2. While $G^{(j)}$ has more than one vertex

(a) Set $R^j = \mathrm{cluster}(G^{(j)}, x, E_{j-\rho+1}, \ldots, E_j)$.

(b) Set $G^{(j+1)} = \mathrm{contract}(G^{(j)}, R^j)$.

(c) Set $j = j + 1$.

3. Set $R = \cup_j R^j$.

The tree output by the AKPW algorithm is the union of the pre-images of the edges in forests $R^j$. Our
preconditioner will include these edges, and another set of edges $S$ constructed using the forests $R^j$.

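To facilitate the description and analysis of our algorithm, we define

• $F^j$ to be the forest on $V$ formed from the union of the pre-images of edges in $R^1 \cup \cdots \cup R^{j-1}$,

• $T^j_v$ to be the tree of $F^j$ containing vertex $v$,

• $E^j_i$ to be the set of edges of $E_i$ whose endpoints lie in distinct trees of $F^j$, and

• $H^j_i = E^j_i - E^{j+1}_i$ and $H^j = \cup_i H^j_i$.

We observe that $F^{j+1}$ is comprised of edges from $E_1, \ldots, E_j$, and that each edge in $H^j$ has both endpoints
in the same tree of $F^{j+1}$.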



Alon et al. prove:

Lemma 3.3 (AKPW Lemma 5.4). The algorithm AKPW terminates. Moreover, for every $i \le j$,

$$|E^j_i| \le |E_i| / x^{j-i}.$$

We remark that $x^{\rho} > |E|$, so for $i \le j - \rho$, $E^j_i = \emptyset$. The following lemma follows from the proof of
Lemma 5.5 of [AKPW95] and the observation that $y^{\rho} > |E|$.

Lemma 3.4. For each simple path $P$ in $F^{j+1}$ and for each $l$, $|P \cap E_l| \le \min(y^{j-l+1}, y^{\rho})$.

3.2 Tree Decomposition 

Our preconditioner will construct the edge set $S$ by decomposing the trees in the forests produced by the
AKPW algorithm, and adding edges between the resulting sub-trees. In this section, we define the properties
the decomposition must satisfy and describe the decomposition algorithm.

Definition 3.5. For a tree $T$ and a set of edges $H$ between the vertices of $T$, we define an $H$-decomposition
of $T$ to be a pair $(\mathcal{W}, \sigma)$ where $\mathcal{W}$ is a collection of subsets of the vertices of $T$ and $\sigma$ is a map from $H$ into
sets or pairs of sets in $\mathcal{W}$ satisfying

1. for each set $W \in \mathcal{W}$, the graph induced by $T$ on $W$ is connected,

2. for each edge in $T$ there is exactly one set $W \in \mathcal{W}$ containing that edge, and

3. for each edge $e \in H$, if $|\sigma(e)| = 1$, then both endpoints of $e$ lie in $\sigma(e)$; otherwise, one endpoint of
$e$ lies in one set in $\sigma(e)$, and the other endpoint lies in the other.

We note that there can be sets $W \in \mathcal{W}$ containing just one vertex of $T$.

For a weighted set of edges $H$ and an $H$-decomposition $(\mathcal{W}, \sigma)$, we define the $H$-weight of a set $W \in \mathcal{W}$ by

$$w_H(W) = \sum_{e \in H : W \in \sigma(e)} w(e).$$

We also define $w_{tot}(H) = \sum_{e \in H} w(e)$.

Our preconditioner will use an algorithm for computing small $H$-decompositions in which each set $W \in \mathcal{W}$
with $|W| > 1$ has bounded $H$-weight.
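
The three conditions of Definition 3.5 can be checked mechanically, which is convenient when experimenting with decompositions. The sketch below is a validator, not the decomposition algorithm itself. It assumes that the $H$-weight of a set $W$ is the total weight of the edges $e \in H$ with $W \in \sigma(e)$, which matches the later remark that each edge can contribute its weight to at most two sets. Sets are referred to by their index in a list, and all names are hypothetical.

```python
def h_weight(W_index, H, sigma):
    """Total weight of the edges of H whose sigma-image contains set W_index."""
    return sum(w for k, (_, _, w) in enumerate(H) if W_index in sigma[k])


def is_h_decomposition(tree_edges, H, sets, sigma):
    """Check conditions 1-3 for non-empty vertex sets `sets` and the map
    `sigma` from edge indices of H to tuples of one or two set indices."""
    # 1. Each set induces a connected subgraph of the tree.
    for W in sets:
        start = next(iter(W))
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for a, b in tree_edges:
                if a in W and b in W and v in (a, b):
                    u = b if v == a else a
                    if u not in seen:
                        seen.add(u)
                        stack.append(u)
        if seen != set(W):
            return False
    # 2. Each tree edge has both endpoints in exactly one set.
    for a, b in tree_edges:
        if sum(1 for W in sets if a in W and b in W) != 1:
            return False
    # 3. Each edge of H is assigned to sets that contain its endpoints.
    for k, (u, v, _) in enumerate(H):
        s = sigma[k]
        if len(s) == 1:
            ok = u in sets[s[0]] and v in sets[s[0]]
        elif len(s) == 2:
            X, Y = sets[s[0]], sets[s[1]]
            ok = (u in X and v in Y) or (u in Y and v in X)
        else:
            ok = False
        if not ok:
            return False
    return True


# Hypothetical example on the path 0 - 1 - 2 - 3.
tree_edges = [(0, 1), (1, 2), (2, 3)]
sets = [{0, 1}, {1, 2, 3}]
H = [(0, 1, 1.0), (0, 3, 2.0)]
sigma = {0: (0,), 1: (0, 1)}
valid = is_h_decomposition(tree_edges, H, sets, sigma)
```

Note that vertex 1 belongs to both sets: the sets may overlap in vertices, while every tree edge lies in exactly one of them.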

Lemma 3.6 (Tree Decomposition). There exists an algorithm with template

$(\mathcal{W}, \sigma) = \mathrm{decompose}(T, H, \phi)$

that runs in time $O(|H| + |T|)$ and outputs an $H$-decomposition $(\mathcal{W}, \sigma)$ satisfying

1. for all $W \in \mathcal{W}$ such that $|W| > 1$, $w_H(W) \le \phi$, and

2. $|\mathcal{W}| \le 4 w_{tot}(H)/\phi$.

Proof. We let $T(v)$ denote the set of vertices in the subtree rooted at $v$, and for a set of vertices $W$, let
$H(W) = \{e \in H : e \cap W \ne \emptyset\}$. We then define $w(v) = w_{tot}(H(T(v)))$. Let $v_0$ denote the root of the tree.
Our algorithm will proceed as if it were computing $w(v_0)$ via a depth-first traversal of the tree, except that
whenever it encounters a subtree of weight more than $\phi/2$, it will place nodes from that subtree into a set in
$\mathcal{W}$ and remove them from the tree. There are three different cases which determine how the nodes are placed
into the set and how $\sigma$ is constructed.

If, when processing a node $v$, the algorithm has traversed a subset of the children of $v$, $\{v_1, \ldots, v_k\}$, such that
$w(v_1) + \cdots + w(v_k) \ge \phi/2$, then a set $W$ is created, all the nodes in $\{v\} \cup \bigcup_{i=1}^{k} T(v_i)$ are placed in $W$, and
those nodes in $\bigcup_{i=1}^{k} T(v_i)$ are deleted from the tree. If a node $v$ is encountered such that $\phi/2 \le w_{tot}(H(T(v))) \le \phi$,
then a set $W$ is created, the nodes in $T(v)$ are placed in $W$, and those nodes in $W$ are deleted from the tree.
In either case, for each edge $e \in H(W)$ we set $\sigma(e) = \sigma(e) \cup \{W\}$.

If a node $v$ is encountered which is not handled by either of the preceding cases and for which $w(v) > \phi$,
then two sets $W_1 = T(v)$ and $W_2 = \{v\}$ are created, and those nodes in $T(v)$ are deleted from the tree. For
each edge $e \in H(\{v\})$, $W_2$ is added to $\sigma(e)$ and for each edge $e \in H(T(v) - \{v\})$, $W_1$ is added to $\sigma(e)$.

When the algorithm finally returns from examining the root, all the remaining nodes are placed in a final
set, and this set is added to $\sigma(e)$ for each edge $e \in H$ with endpoints in this set. The algorithm maintains
the invariant that whenever it returns from examining a node $v$, it has either deleted $v$, or removed enough
vertices below $v$ so that $w(v) < \phi/2$. To see that the algorithm produces at most $4 w_{tot}(H)/\phi$ sets, we note that
each edge in $H$ can contribute its weight to at most two sets, and that every time the algorithm forms sets, it
either forms one set with weight at least $\phi/2$ or two sets with total weight at least $\phi$. □

3.3 Constructing the Preconditioner 

We can now describe our algorithm for constructing the preconditioner We will defer a discussion of how to 
efficiently implement the algorithm to Lemma ITSi 

The algorithm will make use of the parameter 



w{e). 



2. \W\ < 4wtot{H)/(j). 
Algorithm: $(R, S) = \mathtt{Precondition}(G)$

1. Run $R = \mathtt{AKPW}(G)$. Set $h$ to the number of iterations taken by AKPW, and record $R^1, \ldots, R^h$ and $H^1, \ldots, H^h$.

2. For $j = 1$ to $h$

(a) let $\{T_1, \ldots, T_k\}$ be the set of trees in $F^{j+1}$.

(b) for $i = 1$ to $k$

i. let $H$ be the subset of edges in $H^j$ with endpoints in $T_i$

ii. Set $(\{W_1, \ldots, W_l\}, \sigma)$ to $\mathtt{decompose}(T_i, H, |E|/(t\,\theta^{(j)}))$

iii. for each $\mu < \nu \le l$, let $a_{\mu,\nu}$ be the maximum weight edge in $H$ between $W_\mu$ and $W_\nu$, and add $a_{\mu,\nu}$ to $S$.

Lemma 3.7. Let $S$ be the set of edges produced by Precondition. Then,

$$|S| \le 8\rho^2 t^2 = O(t^2 \log n/\log\log n).$$

Moreover, if $G$ has no $K_s$ minor, then $|S| = O(t s \log s \log n/\log\log n)$.

Proof. Let $C_j$ be the total number of sets produced by applying decompose to the trees in $F^{j+1}$. We first bound $\sum_j C_j$. We have

j j 3 i 

To bound this sum, we set 

) if j < i, 

h' = I T.l<^\Hl\ if^ = J, and 



if j > i- 



We observe that Lemma l331 implies < \Ei\ jx^ *, and hj — for j > i + p. As is increasing, we 
have 

j i j i=j-p+l 

i j=i 

■i 

< \E\p, 

as O^^'' < x^^^y^^^ for j < i + p — I. Thus, Cj < Apt, and, because we add at most one edge between 
each pair of these sets, we have \ S\ < 8p^t^. 
As observed by Vaidya, a result of Mader [Bol78] implies that if a graph does not have a complete graph on $s$ vertices as a minor, then the average degree of every minor of $G$ is $O(s \log s)$. Hence, the number of edges added to $S$ at iteration $j$ is at most $C_j s \log s$, and so

$$|S| \le \sum_j C_j\, s \log s \le 4 \rho t\, s \log s.$$

Finally, a graph of genus $s$ does not have a $K_{\Theta(\sqrt{s})}$ minor. □



Using the dynamic trees data structure of Sleator and Tarjan [ST83], we prove:

Lemma 3.8. If $G$ is a graph with $n$ vertices and $m$ edges, then the output of Precondition can be produced in $O(m \log m)$ time.

Proof. We first observe that AKPW can be implemented to run in time $O(m \log m)$, as each edge appears in at most $\rho = O(\log m)$ calls to coloredAwerbuch, and the contractions can be implemented using standard techniques to have amortized complexity $O(\log m)$ per node.

As $j$ could be large, it could be impractical for the preconditioning algorithm to actually examine the entire forest for each $j$. To overcome this obstacle, we observe that the determination of which edges $a_{\mu,\nu}$ to include in $S$ only depends upon the projection of the sets in the decompositions onto vertices at endpoints of edges in $H^j$. That is, rather than passing $(T, H)$ to decompose, it suffices to pass the topological tree induced by restricting to vertices at endpoints of edges in $H$ (i.e., with non-essential degree 2 nodes removed). As this tree has size at most $O(|H|)$, we can implement the algorithm in linear time plus the time required to produce these trees. There are many data structures that allow one to dynamically add edges to a tree and, for any set of vertices in the tree, to produce the induced tree on all least common ancestors of those vertices. For example, one can do this if one can determine (i) the nearest common ancestor of any pair of vertices, and (ii) which of a pair of vertices comes first in an in-order. The dynamic trees of Sleator and Tarjan [ST83] enable edge additions and nearest common ancestor queries at an amortized cost of $O(\log n)$ each, and any algorithm that balances search trees using tree rotations, such as red-black trees, enables one to determine the relative order of nodes in an in-order at a cost of $O(\log n)$ per addition and query. □

3.4 Analyzing the Preconditioner 

We will use weighted embeddings of edges into paths in $R \cup S$ to bound the quality of our preconditioners. The weights will be determined by a function $\tau(j, l)$, which we now define to be



For each edge $e \in H^j$ and each edge $f \in \mathrm{path}_\pi(e)$, we will set $\pi(e, f) = \tau(j, \mathrm{class}(f))$. We will construct $\pi$ so as to guarantee $\mathrm{class}(e) \le \mathrm{class}(f) + \rho$.

It remains to define the paths over which edges are embedded. For an edge $e = (u, v)$ in $H^j$, if $e \in R \cup S$ then we set $\mathrm{path}_\pi(e) = e$ and $\pi(e, e) = 1$. Otherwise, we let $T$ be the tree in $F^{j+1}$ containing the endpoints of $e$ and let $\sigma$ be the function output by decompose on input $T$. If $|\sigma(e)| = 1$, then we let $\mathrm{path}_\pi(e)$ be the simple path in $T$ connecting the endpoints of $e$. Otherwise, we let $\{W_\mu, W_\nu\} = \sigma(e)$ and let $a_{\mu,\nu}$ be the edge added between $W_\mu$ and $W_\nu$. We then let $\mathrm{path}_\pi(e)$ be the concatenation of the simple path in $T$ from $u$ to $a_{\mu,\nu}$, the edge $a_{\mu,\nu}$, and the simple path in $T$ from $a_{\mu,\nu}$ to $v$.

The two properties that we require of $\tau$ are encapsulated in the following lemma.




Lemma 3.9. (a) For all $j \ge 1$, $\sum_{l=1}^{j} \frac{1}{\tau(j,l)} \le (\rho + 2)$, and
(b) For all I >\, j:j>ir{j,l) < {p + I). 



Proof. The first property follows from 



^ r{j, I) ^ {j-l + p+l)^ ^ ^ ^ 

i=\ ' /=i ^ ' i=j-p+i 

j-p ,-+1 J 

= y tl + y y^+^ 

The second property follows from $\sum_{i \ge 1} i^2/y^i \le 1$, which holds because $y$ is greater than the real root of $y^3 - 4y^2 + 2y - 1$, which is about 3.51155. □
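The constant quoted at the end of this proof can be checked numerically. The sketch below assumes the series in question is $\sum_{i \ge 1} i^2/y^i$; it uses the standard identity $\sum_{i \ge 1} i^2 x^i = x(1+x)/(1-x)^3$ with $x = 1/y$, under which the series equals 1 exactly when $y^3 - 4y^2 + 2y - 1 = 0$. The function names are illustrative only.

```python
def cubic(y):
    # The polynomial whose real root is quoted in the proof.
    return y**3 - 4 * y**2 + 2 * y - 1

def real_root(lo=3.0, hi=4.0, iters=200):
    # Bisection: cubic(3) = -4 < 0 < 7 = cubic(4), so a root lies in between.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if cubic(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def series(y, terms=200):
    # Partial sum of sum_{i>=1} i^2 / y^i; the tail is negligible for y > 3.
    return sum(i * i / y**i for i in range(1, terms + 1))

def closed_form(y):
    # x(1+x)/(1-x)^3 evaluated at x = 1/y.
    return y * (y + 1) / (y - 1)**3
```

Calling `real_root()` gives a value within $10^{-4}$ of 3.51155, and `series(y)` drops below 1 once `y` exceeds that root.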

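We now derive the upper bound we need on the maximum weighted congestion of the embedding $\pi$.

Lemma 3.10. For each $j$ and each simple path $P$ in $F^{j+1}$,

$$\sum_{f \in P} \frac{1}{w(f)\,\tau(j, \mathrm{class}(f))} \le (\rho + 2)\, y^{j+1}.$$

Proof.

$$\sum_{f \in P} \frac{1}{w(f)\,\tau(j, \mathrm{class}(f))} = \sum_{l=1}^{j} \sum_{f \in P \cap E_l} \frac{1}{w(f)\,\tau(j, l)} \le \sum_{l=1}^{j} \frac{y^{j-l+1}}{\tau(j,l)} \max_{f \in E_l} \frac{1}{w(f)} \le \sum_{l=1}^{j} \frac{y^{j-l+1}\, y^{l}}{\tau(j,l)} = y^{j+1} \sum_{l=1}^{j} \frac{1}{\tau(j,l)} \le y^{j+1}(\rho + 2),$$

where the third-to-last inequality follows from Lemma 3.4, the second-to-last inequality follows from $f \in E_l$, and the last inequality follows from Lemma 3.9 (a). □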

Lemma 3.11. For each edge $e \in E$,

$$wd_\pi(e) \le (2\rho + 5)\, y^{j+1} w(e).$$

Proof. Let $e \in H^j$, let $T$ be the tree in $F^{j+1}$ containing the endpoints of $e$, and let $(\mathcal{W}, \sigma)$ be the output of decompose on input $T$. If $|\sigma(e)| = 1$, then $e$ is routed over the simple path in $T$ connecting its endpoints, so we can apply Lemma 3.10 to show

$$wd_\pi(e) \le (\rho + 2)\, y^{j+1} w(e).$$

Otherwise, let $\sigma(e) = \{W_\mu, W_\nu\}$, and observe that $\mathrm{path}_\pi(e)$ contains two simple paths in $T$ and the edge $a_{\mu,\nu}$. Applying Lemma 3.10 to each of these paths and recalling $\mathrm{class}(a_{\mu,\nu}) \le j$, which implies $w(a_{\mu,\nu}) \ge 1/y^j$, we obtain

$$wd_\pi(e) \le 2(\rho + 2)\, y^{j+1} w(e) + y^{j} w(e) \le (2\rho + 5)\, y^{j+1} w(e).$$

□
Lemma 3.12. For each $f \in R \cup S$ and for each $j$,



eeHi:fepath^{e) 

Proof. Let $T$ be the tree in $F^{j+1}$ containing the endpoints of $f$, and let $(\mathcal{W}, \sigma)$ be the output of decompose on input $T$. There are two cases to consider: $f$ can either be an edge of $T$, or $f$ can be one of the edges $a_{\mu,\nu}$. If $f$ is an edge of $T$, let $W$ be the set in $\mathcal{W}$ containing its endpoints. Otherwise, if $f$ is one of the edges $a_{\mu,\nu}$, let $W$ be the larger of the sets $W_\mu$ or $W_\nu$. If $|W_\mu| = |W_\nu| = 1$, then the only edge having $f$ in its path is $f$ itself, in which case the lemma is trivial. So, we may assume $|W| > 1$. In either case, each edge $e$ for which $f \in \mathrm{path}_\pi(e)$ must have $W \in \sigma(e)$. Thus,

< (2p + 5)y^+i \E\/te'-^^ 

<i2p + 5)y^^iP\E\/t. □ 
Lemma 3.13. Let $R$, $S$ and $\pi$ be constructed as above. Then,

$$\max_{f \in R \cup S} wc_\pi(f) \le \frac{m}{t}\, 2^{O(\sqrt{\log n \log\log n})}.$$

Proof. For any edge $f \in R \cup S$, we let $l = \mathrm{class}(f)$ and compute

WCtt (/) = ("^^ 
ee-E:/epath^(e) 

= wd^(e)T(j,0 

j eG-ffJ:/epath^(e) 

< ^r(j- 0(2p + 5)AiV|£^IA 

j 

< {p+l){2p+5)fify^\E\/t, 

_ 2C'(\/lognlog logn) 

where the second-to-last inequality follows from Lemma 13.121 the last inequality follows from Lemma IT9l 
(&), and the last equahty follows from = 2°(^'°s"'°s'°g"). □ 



4 One-Shot Algorithms 

Our first algorithm constructs a preconditioner $B$ for the matrix $A$, performs a partial Cholesky factorization of $B$ by eliminating the vertices in trim order to obtain $B = L[I, 0; 0, A_1]L^T$, performs a further Cholesky factorization of $A_1$ into $L_1 L_1^T$, and applies the preconditioned Chebyshev algorithm. In each iteration of the preconditioned Chebyshev algorithm, we solve the linear systems in $B$ by back-substitution through the Cholesky factorizations.
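To make the iteration concrete, here is a minimal sketch of a preconditioned Chebyshev iteration in the standard three-term form. It is not the paper's implementation: the function names, the dense list-of-lists matrix format, and the use of a simple diagonal (Jacobi) preconditioner in the usage note are assumptions chosen to keep the example short. The caller must supply an interval `[lmin, lmax]` with `lmin < lmax` that contains the spectrum of $B^{-1}A$.

```python
def matvec(A, x):
    # Dense matrix-vector product for a list-of-lists matrix.
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def chebyshev(A, b, solve_B, lmin, lmax, iters):
    """Preconditioned Chebyshev iteration for A x = b, starting from x = 0.

    solve_B(r) returns B^{-1} r for the preconditioner B (hypothetical callback).
    """
    theta = (lmax + lmin) / 2.0    # centre of the spectral interval
    delta = (lmax - lmin) / 2.0    # half-width of the spectral interval
    sigma1 = theta / delta
    x = [0.0] * len(b)
    r = list(b)                    # residual b - A x for x = 0
    rho = 1.0 / sigma1
    d = [zi / theta for zi in solve_B(r)]
    for _ in range(iters):
        x = [xi + di for xi, di in zip(x, d)]
        Ad = matvec(A, d)
        r = [ri - adi for ri, adi in zip(r, Ad)]
        z = solve_B(r)             # one solve in B per iteration
        rho_next = 1.0 / (2.0 * sigma1 - rho)
        d = [rho_next * rho * di + (2.0 * rho_next / delta) * zi
             for di, zi in zip(d, z)]
        rho = rho_next
    return x
```

As a usage example, take $A$ to be the 8-by-8 tridiagonal matrix with 3 on the diagonal and $-1$ off the diagonal, and let `solve_B` divide by 3. The eigenvalues of $B^{-1}A$ then lie in $(1/3, 5/3)$, and `chebyshev(A, b, solve_B, 1/3, 5/3, 60)` drives the residual to roundoff level. In the algorithm described above, `solve_B` would instead perform the back-substitutions through the Cholesky factorizations.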
Theorem 4.1 (One-Shot). Let $A$ be an $n$-by-$n$ PSDDD matrix with $m$ non-zero entries. Using a single application of our preconditioner, one can solve the system $Ax = b$ to relative accuracy $\epsilon$ in time $O(m^{18/13+o(1)} \log(\kappa(A)/\epsilon))$. Moreover, if the sparsity graph of $A$ does not contain a minor isomorphic to the complete graph on $m^{\theta}$ vertices, or if it has genus at most $m^{2\theta}$, for $\theta < 1/3$, then the exponent of $m$ can be reduced to $1.125(1 + \theta) + o(1)$.

Proof. The time taken by the algorithm is the sum of the time required to compute the preconditioner, perform the partial Cholesky factorization of $B$, pre-process $A_1$ (either performing Cholesky factorization or inverting it), and the product of the number of iterations and the time required per iteration. In each case, we will set $t = m^{\gamma}$ for some constant $\gamma$, and note that the number of iterations will be $m^{(1-\gamma)/2+o(1)}$, and that the matrix $A_1$ will depend on $m^{\gamma}$.

If we do not assume that $A$ has special topological structure, then $A_1$ is a matrix on $m^{2\gamma+o(1)}$ vertices. If we solve systems in $A_1$ by Cholesky factorization, then it will take time $O(m + m^{6\gamma+o(1)})$ to perform the factorization and time $O(m + m^{4\gamma+o(1)})$ to solve each system. So, the total time will be $m^{(1-\gamma)/2+\max(1,4\gamma)+o(1)} + m^{6\gamma+o(1)}$. Setting $\gamma = 3/13$, we obtain the first result.

If the graph has genus $m^{2\theta}$ or does not have a $K_{m^{\theta}}$ minor, or is the Gremban cover of such a graph, then we can apply part (2') of Theorem 3.1. Thus, $A_1$ is a matrix on $m^{\gamma+\theta+o(1)}$ vertices. In the Gremban cover case, the preconditioner is a Gremban cover, and so the partial Cholesky factorization can ensure that $A_1$ is a Gremban cover as well. As the Gremban cover of a graph has a similar family of separators to the graph it covers, in either case we can apply the algorithm of Lipton, Rose and Tarjan to obtain the Cholesky
factorization of $A_1$. By Theorem 1.6 with $\alpha = 1/2 + \theta$, the time required to perform the factorization will be $O(m + m^{\gamma(3\theta+3/2)+o(1)})$, and the time required to solve the system will be

$$O\left(m^{(1-\gamma)/2}\left(m + m^{\gamma(2\theta+1)+o(1)}\right)\right) = O\left(m^{(1-\gamma)/2+1+o(1)}\right),$$

provided $\gamma(2\theta + 1) \le 1$. We will obtain the desired result by setting $\gamma = (3 - 9\theta)/4$. □



5 Recursive Algorithms 

We now show how to apply our algorithm recursively to improve upon the running time of the algorithm presented in Theorem 4.1.

For numerical reasons, we will use partial $LDL^T$-factorization in this section instead of partial Cholesky factorizations. We remind the reader that the $LDL^T$-factorization of a matrix $B$ is comprised of a lower-triangular matrix $L$ with ones on the diagonal, and a diagonal matrix $D$. The partial $LDL^T$ factorization of a matrix $B_1$ has the form

$$B_1 = L \begin{pmatrix} D & 0 \\ 0 & A_1 \end{pmatrix} L^T,$$

where $D$ is diagonal, $L$ has the form

$$L = \begin{pmatrix} L_{1,1} & 0 \\ L_{2,1} & I \end{pmatrix},$$

and $L_{1,1}$ has 1s on the diagonal.

The recursive algorithm is quite straightforward: it first constructs the top-level preconditioner $B_1$ for matrix $A_0 = A$. It then eliminates the vertices of $B_1$ in the trim order to obtain the partial $LDL^T$-factorization $B_1 = L_1 C_1 L_1^T$, where $C_1 = [D_1, 0; 0, A_1]$. When an iteration of the preconditioned Chebyshev algorithm needs to solve a linear system in $B_1$, we use forward- and backward-substitution to solve the systems in $L_1$ and $L_1^T$, but recursively apply our algorithm to solve the linear system in $A_1$.
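The structure of a solve through a partial $LDL^T$-factorization can be sketched on small dense matrices. This is an illustration under stated assumptions, not the algorithm analyzed here: it eliminates the first $k$ indices rather than vertices in trim order, it solves the remaining block exactly by recursing on the same routine rather than by an inexact preconditioned iteration, and all function names are hypothetical.

```python
def partial_ldl(B, k):
    """Eliminate the first k rows and columns of the symmetric matrix B.

    Returns (L, D, A1) with B = L * [[diag(D), 0], [0, A1]] * L^T, where L is
    unit lower triangular and equals the identity on its last n - k columns.
    """
    n = len(B)
    S = [row[:] for row in B]      # working copy; its trailing block becomes A1
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = []
    for p in range(k):
        pivot = S[p][p]
        D.append(pivot)
        for i in range(p + 1, n):
            L[i][p] = S[i][p] / pivot
        for i in range(p + 1, n):
            for j in range(p + 1, n):
                S[i][j] -= L[i][p] * pivot * L[j][p]
    A1 = [row[k:] for row in S[k:]]
    return L, D, A1

def solve_with_partial_ldl(L, D, solve_A1, c):
    n, k = len(c), len(D)
    # Forward substitution: L u = c (only the first k columns of L are non-trivial).
    u = c[:]
    for i in range(n):
        for j in range(min(i, k)):
            u[i] -= L[i][j] * u[j]
    # Block-diagonal solve with C = diag(D, A1); the A1 block is delegated.
    s = [u[i] / D[i] for i in range(k)] + solve_A1(u[k:])
    # Backward substitution: L^T y = s.
    y = s[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] -= L[j][i] * y[j]
    return y

def solve(B, c):
    # Recursive driver: eliminate half of the indices, then recurse on A1.
    n = len(B)
    if n == 1:
        return [c[0] / B[0][0]]
    L, D, A1 = partial_ldl(B, n // 2)
    return solve_with_partial_ldl(L, D, lambda v: solve(A1, v), c)
```

The callback `solve_A1` marks the point where the recursive algorithm would call itself on $A_1$; everything else is plain forward and backward substitution.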
We will use a recursion of depth $r$, a constant to be determined later. We let $A_0 = A$ denote the initial matrix. We let $B_{i+1}$ denote the preconditioner for $A_i$, $L_i C_i L_i^T$ be the partial $LDL^T$ factorization of $B_i$ in trim order, and $C_i = [D_i, 0; 0, A_i]$. To analyze the algorithm, we must determine the relative error to which we will solve the systems in $A_i$. The bound we apply is derived from the following lemma, which we derive from a result of Golub and Overton [GO88].

Lemma 5.1 (Preconditioned Inexact Chebyshev Method). Let $A$ and $B$ be Laplacian matrices satisfying $\sigma(B, A) \ge 1$. Let $x$ be the solution to $Ax = b$. If, in each iteration of the preconditioned Chebyshev Method, a vector $z_k$ is returned satisfying

$$B z_k = r_k + q_k, \quad \text{where } \|q_k\| \le \delta \|r_k\|,$$

where $\delta \le \left(128 \sqrt{\kappa_f(B)}\, \sigma(A, B)\right)^{-1}$, then the $k$-th iterate, $x_k$, output by the algorithm will satisfy

$$\|x - x_k\| \le 6 \cdot 2^{-k/\sqrt{\sigma_f(A,B)}}\, \kappa_f(A) \sqrt{\kappa_f(B)}\, \|x\|.$$



Our main theorem is: 

Theorem 5.2 (Recursive). Let $A$ be an $n$-by-$n$ PSDDD matrix with $m$ non-zero entries. Using the recursive algorithm, one can solve the system $Ax = b$ to relative accuracy $\epsilon$ in time

$$O\left(m^{1.31+o(1)} \left(\log(\epsilon^{-1}) \log(n\kappa(A))\right)^{O(1)}\right).$$

Moreover, if the graph of $A$ does not contain a minor isomorphic to the complete graph on $m^{\theta}$ vertices, or has genus at most $m^{2\theta}$, or is the Gremban cover of such a graph, then the exponent of $m$ can be reduced to $1 + 5\theta + o(1)$.



We note that if $G(A)$ is planar, then the algorithm takes time nearly linear in $m$.

The following two lemmas allow us to bound the accuracy of the solutions to systems in Bi in terms of the 
accuracy of the solutions to the corresponding systems in Ai. 

Lemma 5.3. Let $LCL^T$ be a partial $LDL^T$-decomposition of a symmetric diagonally dominant matrix. Then,

$$\kappa(L) \le 2n^{3/2}.$$

Proof. As $L$ is column diagonally-dominant and has 1s on its diagonal, $\|L\|_1 \le 2$; so, $\|L\| \le 2\sqrt{n}$. By a result of Malyshev [Mal00, Lemma 1], $\|L^{-1}\| \le n$ (also see Peña [Peñ98]). □

Lemma 5.4. Let $B$ be a Laplacian matrix, let $LCL^T$ be the partial $LDL^T$-factorization obtained by eliminating vertices of $B$ in the trim order. Then, $\kappa(C) \le \kappa(B)$.

Proof. We recall that $C$ has form

$$C = \begin{pmatrix} D & 0 \\ 0 & A_1 \end{pmatrix}.$$
The factor Ai is identical to that obtained from partial Cholesky factorization, so k{Ai) < k{B) follows 
from Lemma lOl To now bound k(C), we need merely show that each entry of D lies between the smallest 
and largest non-zero eigenvalues of B. This follows from the facts that the ith diagonal of D equals the value 
of the diagonal of the corresponding vertex in the lower factor right before it is eliminated, this value lies 
between the smallest and largest non-zero elements of the corresponding factor, and by Lemma [131 these lie 
between the largest and smallest non-zero eigenvalues of B. □ 
Lemma 5.5. Let $B$ be a Laplacian matrix and let $LCL^T$ be the partial $LDL^T$-factorization obtained by eliminating vertices of $B$ in the trim order. For any $c \in \mathrm{Span}(B)$, let $s$ be the solution to $Cs = L^{-1}c$ and let $\tilde{s}$ satisfy $\|s - \tilde{s}\| \le \epsilon \|s\|$. Let $y$ be the solution to $L^T y = \tilde{s}$. Then

$$\|c - By\| \le \epsilon\, \kappa(L)\, \kappa_f(C)\, \|c\|.$$



Proof of Lemma 5.5. First, note that $c - By = LC(s - \tilde{s})$ and $c = LCs$. Moreover, $s$ must lie in $\mathrm{Span}(C)$. Thus, $\|Cs\| \ge \lambda_2(C) \|s\|$, and so

$$\|C(s - \tilde{s})\| \le \epsilon\, \kappa_f(C)\, \|Cs\|.$$

As $L$ is non-degenerate, we may conclude

$$\|LC(s - \tilde{s})\| \le \epsilon\, \kappa(L)\, \kappa_f(C)\, \|LCs\|.$$

□ 



Proof of Theorem 5.2. For $A_0, \ldots, A_r$, $B_1, \ldots, B_r$, $C_1, \ldots, C_r$, and $L_1, \ldots, L_r$ as defined above, we can apply Lemma 5.4 and Theorem 3.1 to show:

• $\kappa_f(A_i) \le \kappa_f(B_i) \le m^{i(1+o(1))} \kappa_f(A)$,

• $\kappa_f(B_i) \le m^{1+o(1)} \kappa_f(A_{i-1}) \le m^{i(1+o(1))} \kappa_f(A)$.

In the recursive algorithm we will solve systems in $A_i$, for $i \ge 1$, to accuracy

$$\epsilon_i = \left(128\, m^{i(1+o(1))} \left(2n^{3/2} \kappa(A)\right)\right)^{-1}.$$

By Lemma 5.5 and the above bounds, we then obtain solutions to the systems in $B_i$ to sufficient accuracy to apply Lemma 5.1.

Let $m_i$ be the number of edges of $A_i$. When constructing the preconditioner, we set $t_i = (m_i)^{\gamma}$, for a $\gamma$ to be chosen later. Thus, by Theorem 3.1 and Proposition 1.1, $m_i \le m^{(2\gamma)^i}$, and $\kappa_f(A_i, B_{i+1}) = m^{(2\gamma)^i(1-\gamma)+o(1)}$.

We now prove by induction that the running time of the algorithm obtained from a depth r recursion is 

O (7rA+°(i) {rlog{nK{A)f^ , where 

/3/= (V^)E(27r^ + 2(27r, 

i=l 

and $\gamma = (3 - \sqrt{5})/2$. In the limit, $\beta_r$ approaches $\beta_\infty = (3 + \sqrt{5})/4$ from above. The base case, $r = 1$, follows from Theorem 4.1.

The preprocessing time is negligible as the partial Cholesky factorizations used to produce the $C_i$ take linear time, and the full Cholesky factorization is only performed on $A_r$.

Thus, the running time is bounded by the iterations. The induction follows by observing that the iteration time 
is TO^+°(^) (^m + mf''^^^ log (K(Ar)K(_Br)/er), which proves the inductive hypothesis because mf''"^ > 
TO. As 1.31 > /3oo, there exists an r for which Pr < 1.31. 

When the graph of $A$ does not contain a $K_{m^{\theta}}$ minor or has genus at most $m^{2\theta}$, we apply a similar analysis. In this case, we have $m_i \le m_{i-1}^{\gamma}\, m^{\theta+o(1)}$. Otherwise, our proof is similar, except that we set $\gamma = (3 - \theta - \sqrt{1 + 6\theta + \theta^2})/2$, and obtain $\beta_\infty = (3 + \theta + \sqrt{1 + 6\theta + \theta^2})/4$, and note that $\beta_\infty \le 1 + 5\theta$. □
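The limiting exponents in this proof are easy to evaluate numerically. The sketch below assumes the closed forms $\gamma = (3 - \sqrt{5})/2$ and $\beta_\infty = (3 + \sqrt{5})/4$ for the general case, and assumes that the radicand in the minor-free case reads $1 + 6\theta + \theta^2$; the function names are illustrative. It also checks the observation that, under these assumptions, both pairs satisfy $\beta_\infty = 1 + (1 - \gamma)/2$.

```python
import math

# General case (assumed closed forms).
GAMMA_GENERAL = (3 - math.sqrt(5)) / 2
BETA_GENERAL = (3 + math.sqrt(5)) / 4

def gamma_minor(theta):
    # Minor-free or bounded-genus case; the radicand 1 + 6*theta + theta^2 is an assumption.
    return (3 - theta - math.sqrt(1 + 6 * theta + theta**2)) / 2

def beta_minor(theta):
    # Limiting exponent for the minor-free or bounded-genus case, same assumption.
    return (3 + theta + math.sqrt(1 + 6 * theta + theta**2)) / 4
```

With these definitions `BETA_GENERAL` evaluates to roughly 1.309, below the stated 1.31, and `beta_minor(0.0)` equals 1, matching the remark that the planar case runs in time nearly linear in $m$.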